Reducing gaps in quantitative association rules: A genetic programming free-parameter algorithm

Authors

  • José María Luna
  • José Raúl Romero
  • Cristóbal Romero
  • Sebastián Ventura
Abstract

The extraction of useful information for decision making is a challenge in many different domains. Association rule mining is one of the most important techniques in this field, discovering relationships of interest among patterns. Although the mining of association rules is an area of great interest for many researchers, the search for well-grouped continuous values is still a challenge: the aim is to discover rules that do not comprise patterns representing unnecessary ranges of values. Existing algorithms for mining association rules in continuous domains are mainly based on a non-deterministic search, requiring a high number of parameters to be optimised. These parameters hinder the mining process, and data mining experts who want to use the algorithms must know the algorithms themselves. We therefore present a grammar-guided genetic programming algorithm that does not require as many parameters as other existing approaches and enables the discovery of quantitative association rules comprising small-size gaps. The algorithm is verified over a varied set of data, comparing the results to other association rule mining algorithms from several paradigms. Additionally, some resulting rules from different paradigms are analysed, demonstrating the effectiveness of our model for reducing gaps in numerical features.
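As a rough illustration of the ideas in the abstract, the following minimal sketch evaluates one quantitative association rule on a toy data set. Support and confidence follow their standard definitions. The function `unused_fraction` is a hypothetical stand-in for the notion of an interval covering an "unnecessary range of values"; the abstract does not define how gaps are measured, so this is an assumption and not the authors' measure. All names and data here are invented for illustration.

```python
from typing import Dict, List, Tuple

Record = Dict[str, float]
Interval = Tuple[float, float]    # closed interval [low, high]
Condition = Dict[str, Interval]   # attribute name -> interval


def covers(cond: Condition, rec: Record) -> bool:
    """True if every attribute in cond falls inside its interval."""
    return all(lo <= rec[attr] <= hi for attr, (lo, hi) in cond.items())


def support_confidence(ante: Condition, cons: Condition,
                       data: List[Record]) -> Tuple[float, float]:
    """Standard support and confidence of the rule ante -> cons."""
    n_ante = sum(covers(ante, r) for r in data)
    n_both = sum(covers(ante, r) and covers(cons, r) for r in data)
    support = n_both / len(data)
    confidence = n_both / n_ante if n_ante else 0.0
    return support, confidence


def unused_fraction(attr: str, interval: Interval,
                    data: List[Record]) -> float:
    """Hypothetical gap indicator (an assumption, not the paper's measure):
    the share of the interval width not spanned by the values that
    actually fall inside the interval."""
    lo, hi = interval
    vals = [r[attr] for r in data if lo <= r[attr] <= hi]
    if not vals or hi == lo:
        return 0.0
    return 1.0 - (max(vals) - min(vals)) / (hi - lo)


# Toy data, invented for this example.
data = [
    {"age": 20.0, "income": 30.0},
    {"age": 25.0, "income": 35.0},
    {"age": 30.0, "income": 40.0},
    {"age": 50.0, "income": 90.0},
]

# Rule: age in [18, 40] -> income in [25, 45]
antecedent = {"age": (18.0, 40.0)}
consequent = {"income": (25.0, 45.0)}

sup, conf = support_confidence(antecedent, consequent, data)
loose = unused_fraction("age", (18.0, 40.0), data)
tight = unused_fraction("age", (20.0, 30.0), data)
print(sup, conf, loose, tight)
```

In this toy case, both intervals for `age` cover the same three records, so the rule's support and confidence are unchanged, but the narrower interval leaves no unused range. This is the kind of difference between rules that the abstract refers to when it speaks of reducing gaps.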


Similar resources

A Mathematical Programming Model and Genetic Algorithm for a Multi-Product Single Machine Scheduling Problem with Rework Processes

In this paper, a multi-product single machine scheduling problem with the possibility of producing defective jobs is considered. We consider rework in the scheduling environment and propose a mixed-integer programming (MIP) model for the problem. Based on the philosophy of just-in-time production, minimization of the sum of earliness and tardiness costs is taken into account as the objective fu...

A parameter-tuned genetic algorithm for vendor managed inventory model for a case single-vendor single-retailer with multi-product and multi-constraint

This paper develops a single-vendor single-retailer supply chain for multiple products. The proposed model is based on the Vendor Managed Inventory (VMI) approach, and the vendor uses the retailer's data for better decision making. The number of orders and the available capital are the constraints of the model. In this system, shortages are backordered; therefore, the vendor’s warehouse capacity is another limitati...

Introducing an algorithm for use to hide sensitive association rules through perturb technique

Due to the rapid growth of data mining technology, obtaining private data on users through this technology becomes easier. Association Rules Mining is one of the data mining techniques to extract useful patterns in the form of association rules. One of the main problems in applying this technique on databases is the disclosure of sensitive data by endangering security and privacy. Hiding the as...

Mining association rules with single and multi-objective grammar guided ant programming

This paper treats the first approximation to the extraction of association rules by employing ant programming, a technique that has recently reported very promising results in mining classification rules. In particular, two different algorithms are presented, both guided by a context-free grammar, specifically suited to association rule mining, which defines the search space. The first proposal...

Improved Genetic Algorithm Approach for Sensitive Association Rules Hiding

Association rule mining is an interesting area of data mining research which discovers correlations between different item sets in a transaction database. Efforts have been made towards efficient hiding of sensitive association rules, but these techniques do not consider consequences such as loss of information, lost rules and an increase in ghost rule production. In this paper, we propose improved ...

Journal:
  • Integrated Computer-Aided Engineering

Volume: 21

Publication date: 2014